CYP2S1 promoter sequence: 10 kb immediately upstream of the initiating ATG (start of coding sequence)



-10,000bp 



AAAGGATGGG GTGAGGTGAT GGGGTGAGGA TGTAGGATAA TGGGACAGGA
TGAGCAGTGG GATGAGGGGA TGGAATGAAG GACTGGATAA GGGATAGGTG
GGGGTAAATG AGAGCATGGG GGAGGCAGTG CTCTCCTGAT GGTGGGGTGC
ACgagtggat GGATGACAGG ATAAATAGGG AAGGGAGGAG GGATAGGATG
ACGAGACGGC TGTAGAAGCC CAGAGCAGAG AACATTGCTG CTTTGGGGTC
GATGATGTAA TCACCTCAAC TCACTGACAC TATTCCCAGC CACGGATGAT
GCTCACAGAA TCTGGGGAAG TCCAAGGCCT GGAAGCAGGA CTCATCTTGG
ACTTCCCCTT CTATCTAGTT ccaggtgcTG AATGAggcac ctctgaagaa
GAGAAAGGAG AGAGACTAAG ATAAACAAGA CTGAGAGGAA AAAATCAGAG
TGGGCAGGGA GAGTGAGCCT GGTAAAGTGG ACCACAGAGC AGACAGGCTG
TGGCTTAGCC TTGGACAGCA GGTGGGGTTC CAGAGCCATA TGCTTGGAGG
AGCCTTAGCA AACTAAATCC CCCAGCAGTT TCTTAAACCC ATCCATCACA
CAGCTTGCCA GAACCCTGGG GTTGGCAGCT TCCAGAATGG TTAGGAAAAT
CCAGAGTAGT GGTCAGGCGC GGTGGCTCAT GCCTGTAATC CCAGCACTTA
GGGAAGCCAA GGCAGGCGGA tcactAGGTC Aggagatcga aaccatactg
GTTAACACGG TGAAACCCCG TCTCTACTAA AAATACAAAA AATTAGCTGG
GCATGGTGGC ATGCGCCTGT AATCCCAGCT ACTCGGGAGG CTGGGGCAGG
AGAATCACTT GAACCCGGGA GGCAGATATT GCAGTGAGCC GAGATCGCGC
CATTGCACTC CACCTGGGCA ACAGAGCGAG ACTCCGTCTC AAAAAAAAAA
AAAGAAAGAA AGAAAAAGAA AATCCACAGT AGGGGGCCAG ACACAAAAAT
GATCACTCCA GCACTGTCCA GCCCAGATCA GAGGGTTTCT GATGGGAAGT



Figure 7 
AGCTGGGGTC AGGGCAAGGA GTGGTGGAAA AAGTCAGGCT GTTTTCAGCT
GAACTATACA AATGGGCATC TCCTGGCCCA GGGTGGGGAT TTGGGATTGC 
AGAAAGGCCA GAATCCACCT GGAATCACTC AGTTACTGTG AAATCTATCT 
TGGGAACCTA AGAATGTTTG CTTTCTAGAC TTGAGAATTT TGGACACTTG 
ATTGCTTTCT GGATGAATTT TAGAGATTTA TAAATTGTAT TGAAAGTGTT 
TATTCGACAA GATGTTTATT GAGCATCCAC AGTGTGTTAG GCACTGGGGA 
TACAGCAACA CACAAAACAG ACAGAGAATC GGCCCTTATG GAGAGACCAT 
TTCAGTGGGA AAAGGGAGTA TAAAAAAGCA AATCAAGGTC GGGAGCAGTG 
GCTCCCACCA GTAATCCCAG AACTTTGGGA GGCCGAGGCA GGTGGATCGC 
TTGAGCCCTG GGCAACATAG CTAAACCCTG TCTCTACAAA AAATTAGCCA 
GGCATGGAGC GCGTACCTGT AGTCCCAGCT ACTCAGGAGG TCGAGGCAGG 
AGGATCGCTG ACATCTGTGA GGCAGAGGCT TCAGTGAGCA GAGATCACAC 
TACTGCACTC CAGCTTAGGC AACAGAGCAA AACTCTGTCT TTAAAAAAAA 
AAAAAAGTAG GCCGGGCAGG GCCGGGCCCA GTGGCTCATG CCTGTAATAC 
CAGAACTTTG GGAGGCCAAG GTGGGTGGAT CACTTGAGTG AGGTCAGAAG

TTCAAGACCA GCCTGGCCAA CATGGTGAAA CCCTGTCTCT ACTAAAAATA 
CAAAAATTAG CTAGGCATGG TGTCACATGC CTGTAGTCCC AGCTACTCAG 
GAAGCTGAGG CAGGAGTATC ACTTGAATCC AGGAGGCAGA GGTTGCAGTG 
AACGGAGATC ACACCACTGC ACTCCAGCCT GGGCAACAAG TGTGAGACTC 
CATCTCAAAA AAGAAAGTGA ATCAATATAT AAAATATAAA AAGACAAAAA 

ATAATACACG TTGGCAATGA TGTGGAGGAA AGGAAACATA CCCTGTTGGT 
GAGAATGTAA ATTAGTCCAG CCACTATGAG AAACAGTATG GAAATTTCTC 
AAAAAACTAT CATAAGGGCT GGGTGCGGTG GCTCACGCCT GTAATCCCAG 
CACTTTGGGA GGCCGAGGTG GGTGGATCAC AAGGTCAGGA GATCCAGACC 
ATCCTGGCTA ACACGGTGAA ACCCCGTCTC TACTAAAAAT ACAAAAAAAA 
AAAAAAATTA GCTGGCCATG GCGGCGGGCA CCTGTAGTCC CAGCTACTCA
GAAGGCTGAG GCAGGAGAAT GGCGTGAACC CAGGAGGCGA AGCTTGCAGT

GAGCCGAGAT GGCACCACTG CACTCCAGCA TGGGCGACAG AGCAAGACTC 
CATCTCAAAA AACAAACAAA AAACAATCAT ATGATCCAGC AATCCCACTA 
CTGGGAATTT ATGGAAAGGA AAAGAAATCA GTGTATCAAA GGGATAGCTA 
CACAGCAATG TTTATTACAG CACTATTCAC AATAGCAAAG ATATGGAATC 
AACCTAAATG TCCATCAACA GATGAATGGA TAAAGAATAT GTGGTACATC 
TACACAATGG AAAACTATTT GGCCGTTAGA AAAAGAATAA AATCCTGTCA 
TTTGCAGCAA CATGTGAAAC TGTCTGTCCC TACAGGGTTG ACAAGAACTG 
CAAGCCAGGT TCTAGATAGA AATATAATTA AGCATTGGCT GGGCACAGTG
GCTCACACCT GTAATCCCAG CACTTTGCGA GGCCGAGGTG GGCAGATCAC 
TTGAGGGCAG GTGTTCGAGA CCAGCCTGGC CAACATGGTG AAACCCTGTC 
TCTACTAAAA ATACAAAAAG TAGCTGGGTG TGATGGCAGG TGCCTGTAAT 
CTCAGCTACT TGGGAGGCCT AGGCAGAATT GCTTGAACCC GGGAGGCAGA 
GGTTGCAGTG AGCCGAGATC ATGCCATTGC ACTCCCAGCT TGGGTGACAG 
AGTGAGACTC AAAAAAAAAA AAAAAAAAGA AAAAGAAAGA AAGAAAGAAA 
ATTAAGCATT AATCATGCTG CACTTTGGTC CACTTCCTTG TTGCTGAAAG 
CCACATAGCT CTAGATGCTG ACCATTTGTA TCCCCATTGT TCTTATAGAC 
AGCATCGCTG ACCTTAGAAT CATGATGTTT TTGTTAAGGA TCACGTCAGA 
TGTTTTTTGG ACCCCCAATT CCAGCCACCA GTTTGAAGAC CCCTACAGAG 
GATGGGGATT GTCAGGCCTC TGAGCCCAAG CTAAGCCATC ACATCCCCTG 
TGACCTGCAC GTATACATCT AGATGGCCTG AAGTAACTGA AGAATCACAA 
AAGAAGTGAA AATGGCCTGT TCCTGCCTTA ACTGATGACA TTACCTTGTG 
AAATTCCTTC TCCTGGCTCA TCCTGGCTCA AAAGCTCCCC GGCTGAACAC 
CTTGTGACCC CCACCCCTGC CAGCCAGAGA ACAACCCCCT TTGACTGTAA 
TTTTCCACTA CCTACCCAAA TCCTATAAAA CGGCCCCACC CCTATCTCCC 
TTCACTGACT CTTTTTGGAC TCAGCCCGCC TGCACCCAGG TGAAATAAAC
AGCCTTGTTG CTCACACAAA GCCTGTTTGG TGGTCTCTTC ACACGGACGC
GAGTGAAAGG GATCAGCATG AGACTATAAC TTCTTCTTCC ACCCTCTGTC 
CCGTGACTTC ACTCTGCACT CTTCAACCAA TCAACGATCT CCACCCTTCA 
GCCCACTCCA AAACCCTTGA ACACCCTAGC CCCAAACTCT TAGGGGAGAT 
GGATGTGAGG TTTCCCCCCA TCTCCTCATT CAGTGACCCT ACAATTAAAC 
CTGCTTCTCT GCTGCAAACC AGTTATAACT GTAGTAGGCT CATTGCCCAG 
TGCACACAGC AAATCAACAG GAGACACTGG GTTGCAGGAG AGAAGAGGTT 
TCATCGTAGG GTCGCCAAAA GAGATGAGGA GTTGAAGAAT GTAGGGTGAA
GTCACGGGAC AGGGAGATGA AGAAGCCACA TTCTCATGCT GATCCCATTC 
CCCAGTGGGT AGCCTTCACA CTGGTTGCTG GAATTCAAGG TCTGAAAAGC 
ATCTTTTTAC ATTTTTGTTT ATGTATTTAT TATTATTATT ATTATTATTA 
TTATTATTAT TATTATTATT GAGATGGAGT CTCACTCTGT CTCCCAGGTT 
GGAGTATAGT GGTACAATCT CGGCTCACTG CAACCTCTGC CTCCTGGGTT 
CAAGCAATTC TCCTGTCTCA GCCTCCTGAG TAGCTGGGAT TACAGGTGTG 
AACCACCTCC CCCCACCACC TCCACTCCGC TAATGTCCTT TGTATTTTTA 
GTAGAGATAG GGTTTCACCA TGTTGACTGG GTTGATCTTG AAGTCCTGAC 
CTCAAGTGAC CTACCCACCT CAGCCTCCCA AAGTGTTGGG ATTATGGGTG 
TGAGCCACCG TGCCTGGCCC TGAAAAGCAT CTTAAGTGAT TCTTTCTTTA 
ACAAAAGCCT TATGACTCTA ATATCAGAGA TTCTGTCTAT AGGAACAATG 

gGGGTGCACa TGGTCAGTAT CTAGCTCTAC ctgagtttta gcaacaagga 

AATGGACCAA AGTGCAGCCC GAATAACACT TAATTATAAG TATGTTTCTG 
TCCAGAACCC AGCATGCAAT TCTTGTCAGC CCTGTGGGAA TGGTTTCACA 
GTGTCTCGAT ATACTGACTT GCTGTGTGCA TTGGGCAACA AACCTATTAC 

AATTACACAT GGATGTAACT GGAGGTCAT1 ACATTAATTG AAATAAGCCA 

GGCACAGAAA GATAAACAAT GCATGTTCTT ACTCCCAAGT GGAAGCTAAA 
AAAGTTGATC ACATGGAGGT AGAGAATGGA ATGATGGATA CTAGAGACTG 
GGAAAGGCGT ATGGGTGGGG TGGGGTGGGG AGAGAGGTTG GTTAATACAC
CTAGACAGAA GGAATAAGTT CCCTTTTTTT TTGAGACGGA GTCTCACTCT
GTTGCCCAGG CTGGAAGGCA GTGGCACAAT CTCAGCTCAC TGCAACCTCT
GCCTCTTGGG TTCAAGCAAT CCTCCTGCCT CAGCCTCCAG AGTAGCTGCG
ATTACAGGCA CGTGCCACCA TAACCGGCTA ATTTTTTTTT TTTTTTCAGA
CGGAGTCTCA CTCTGTCACC CAGGCTGGAG TGCAGTGGCA CAATCTCAGC
TCACTGCAAG CTCCACCTCC CAGGTTCACG CCATTCTCCT GCCTCAGCCT
GCCGAGTAGC TGGGACTATA GACGCCTGAC ACCACGCCCG GCTAATTTTT
TGTATTTTTA GTAGAGATGG GGTTTCACCG CGTTAGCCAG GATGGTCTTG
ATCTCCTGAC CTCGTGATCT GCCCGTCTCG GCCTCCCAAA GTGCTGGGAT
TACAGGCGTG AGCCACCACG CCCGGCAAGA ACTTTTAAGT TTTCTTATCT
ATAGGATGTT GCAATCATCA TCTTTAAACA TTAGACATGG AATCTTTATA


ATAATCTTGC CATATATATA TATATATATA TATTTTTTTT TTTTTTTTTT


TTTTTTTTTT GACACTGAGT CTCACTTTAT CGCCCAGGCT GGAGTACAGT
GGCACAATCT TGGCTCACTA CAACCTCCAC CTCCTGGGTT CAAGTGATTC
TCCTGCCTCA GCCACCCCAG TGGCTGGGGA ctacagGCGT GCACcacc
ATCCAGTTAA TTTTTTTTT TTTTTTGAGA CGGAGTCTCG TTCTGTCGCC
CAGGCTAGAG TTCAGTGGGG AGATCTCAGC TCACTGAAAC CTCCGCCTTG
TGGGTTCAAG CAAGCAATTC TCTGCCACAG CCTCCCGACT AGCTGGGATT
AGAAGTGCCC ACCACCACGT CTGGCTAATT TTTGTATTTT TAGTAGAGAC
GGGGTTTCAT CATCTTGGCC AGACTGGTCT TGAACTCCTG ACCCCGTGAT






[illegible] ATTACAGGCG TGAGCCACTG


CGCCTGGCCG GTTAATTTTT ATATTTTCAG TAGAGACAGG ATTTCACCAC
GCTGGCTAGG CTGGTCTCAA ACTCCTGACC TCAGGTGATC CACCCGCCTT
GGCCACTGTG CCTGGTCAAC AGTCTTTCTA TTTTTATTCT AGGCTGGAGA
CCTTTGTCTC AAAAACAAAA CGAGAATGCT CCCTGGAGTC TGTACTGATC
CCTCTTCCCT CCCACCGTAG ATTAGTTTTC TCCTTGCATT TAAAAAGACC
TTTCTGGCTG GCATCCAGTG AATGAATTGT GGAGGAGGGG GAGAAGGAGA

GAGGAAGCAG GTAGCTCAGT GGGGAAGCTA CTGCAAAATC CTGGCAAGCA 

ATGACAGTAC CTTGAATCAG GGCTGGTGTC AGTTGGATCA GGGTGGTAGG 

GAAAGGAGGA AATGGATAAA TTTGGAATGT ATTTGGAGGT AGAGCCAGCA 
GGATTTGCTG ACAAACTGGG TTTAAAGTCA AAGAGAGAAA TCAAGGTTAA 
ACCTGACAAA TAAAAACAGA TGTGGTCTCA GGCGAGTAGA GACATTATGC 
AGAAAGACTA TTGCATCAGG GGGAAAGATG GCTGTAAAAA CAATGAACAA 
GACCAGAATC TGATAACCCA GAAGGATGTG TTGTCTAATG AAACTAATTT 

tttcccctcc tcctattttt tttttgagac ggagtttcac tcttgttgcc 
caagctggag tgcaatggcg cgatctcggc tcactgcaac ctccgcctcc 
cggattcaag cgattctcct gcctcagcct ccctgagtag ctgggattac 

agGCATGCAC caccacacct ggctaatttt gtatttttag tagagacggg

GTTTCACCAT GTTGGCCAGG CTGGTCTCAA ACTCCTGACC TCAGGTGATC 

TGCCCACCTC agcctcccaa agtactggga ttgcaggcat gagacaccgc 
gcccggcctc tcctatattt tgttgtcatc agcaagtgaa aagatggtga 
taccttttac agaggtaagg aaggaggtga gagaaagtat tcccagatgg 
ggtgggaagc tggtacagcc cactttgcag gaggtggggg aatcgggaat 
tcttttatat ccatgaagtt tgagatgtct gttagctctc ccaggggtag 
aacagaggga gcagataggc tcaaggttgg atttggaacg tcctagaaac 
cttccagaac aaggcaaagg aggaactgag aactggcatt tacttcatag 
caagagcgta tgagcctccc cacccctcct cctttggctt cagggcaccc 
ctggaatgtt agaggctaga atcaatgcta aagaagacca cagtcaagga 
ttccccagac tccagggagc actctggcta tgctcttgag agaaagggct 
ctggactaga atacaaattg caagattgca ggccgggggc agtggctcat 
gcctgtaatc ccagcactgt gggaggccga ggtgggtaaa ttgcctgagg 

TCAggaggtc gagaccagcc tggccaacat ggtgaaaccc caactctact 



WO 2004/091150 



PCT/GB2004/001453 



13/14 

AAAAATACAA AAGTTAGCTG GGAGTGGTGG TGGGCGCTTG TAATCCCAGC 
TACTCAGGAG GCTGAGGCAG GAGAATCACA TGAACCCAGG AGGCAGAGAT 
TGCAGTGAGC CAAGATCGTG CCACTGCACT CCAGCCTGGG CAACAGAGCG 
AGACCCTGTC TCAAAAAAAG ATTGTGAAAA TTCTAAGAAT CTAATTTTTT 
TTTTTTTTTT TTTTTTTTTG AGACGGAGTC TCGCTCTGTC GCCCAGGCTG 
GAGTGCAGTG GCGGGATCTC GGCTCACTGC AAGCTCCGCC TCCCGGGTTC 
ACGCCATTCT CCTGCCTCAG CCTCCCAAGT AGCTGGGACT ACAGGCGCCC 
GCCACTACGC CCGGCTAATT TTTTGTATTT TTAGTAGAGA CGGGGTTTCA 
CCATTTTAGC CAGGATGGTC TCGATCTCCT GACCTCGTGA TCCGCCCGCC 
TCAGCCTCCC AAAGAATCTA ATATTTTAAA ACTCCAGCAT ATGCAACTCA 
AAGCTCATCT AATTATACAC TTAAGAGTTA TGTATTTCAT TGTATATAAG 
ATACCTTGAG GAACAAAAAG TATCTGTAAA CAAATACTGA GCTCTAGCTA 
CTGATATGCA TGCTGATGTA TTTGGGAGTG AAGTGTACTG GTATCTGCAA 
CTGACTTTGA AATGCTTAAA AAAAAATCAA TGGATAGGCA AAATGAACAG 
ATATGTAATG AAAAAAGGGC CAGGCACAGT GGCTCATGCC TGTAATCCCA 
GCACTTTGGG AGGCTGAGGT GAGAAGATCA CCTGAGGTCA GGAGTTTGAG 

ACCAGCCTGG TCAACATGGC AAAAACCCCG TCTCTACTAA AAATACAAAA 
AATAGCCAGG CATGGTGGTG CACGCCTGTA ATCGCAGCTA CTTGAGAGGC 
TGAAGCAGGA GAATTGCTTG AACCCGGGAG GCAGAGGTTG CAATGAGCCA 
AGACTGTATG CTATTGCACT CCACTCTGGG CAACAGAGTG AGACTCTATC 
TCAAACAAAA AAAGAATAGA TAGGTAACAA AGAAAGGATA ATAAAATGGT 
GGAATCTGAT GGGTATACAG GTGTTCACTG TATGTTTGAA ATTTTAATAT 
TTTTATAATA AAATACGAAA TCAAAATGCA AAGCAAACAA AGTAACTGCC 
CTGCTGAAAA CCCCTCCAGA CAGCTTCCGT TTGCAATAGA AATAAAATGC 
AGTCTTTTCC AAGACCTTGC AAGATGGCCC CTGCAGACTT CATGAATCTT 
ATCTCCTACA TTCAATCTTT GAAAATATAA CAAATATTTA TTGAGTGCCT 
ACTGTGTGTC AGGCACTGCA CTATATGTTG GGGATACAGC AGTGAACCAA 
ACCTACAATA ATTCATGCAC AAATGGTTCT AACATTCTTC CTTCTCAAGA
AGCCCATTCC CCTGGCAGGC CTTCTCACTT CCTGAAATTG TTTTCATCTC 
ATCTTCATCT GCATGTGGAT ATGAGCTCAG GAAGGCCAGA GCCCTTCTCT 
CCATCGCTCA TAGCTGCACC CTCGGGTCTA GAACAAGGTC TTGTTTATAG 
TAAGACCTCA ATAACAGTTA GTTGCAGAAA TATTAGCTAA CATCTATTTA 
GTGCTTGTTA CATGTGCACC TGCGTCTGCT TTAAATGGTG TGCACAGATG

TACTATTTAA TCCTGAAAAC TAGCCCCATT TTACAAGGCT AACAGAGGCA 
CAGAGGAATT AGGCTGTGAA TTCAAGCAGT CTGGCTTCAG CCATCACGGT 
CCTAACCGCT CTGCAAACTG CCTCTAAACG AATGAATGAA TGACTCAATC 
ATTCTAACTG GCCCAAGTCA AAATATTAAC TCTGCCACTC AGCCAAGTTG 
ACCTGGGGCC AACCGATTTA TCTCCCTGGA CTTCAGATTG TTTATCCTTA 

AAGGTCA&GT ccttcctctc agttcgtgaa gatgtaagat gctggagaaa 

ATTGACTCAA AGACACCCCA TACACGCACG CACACAGGAA CCCACATCCG 
AACAATGAGC GAGGTAGGAG CTGGGGAGAG TGGCGCGGGG TCACCGCCAG 
ATGCGGGGTG TGGCAACCTC CAGGCGGCCA AAGTCTCTGA CTTCCAGGTT 
CTTCTGTTTG CTTACTCCCT ATCCGGGGGC CCAAGGCGCT GTCTCCGCCG 
CCCAAGCCCC GCGTAAACCT GGGTGACCTC GGAGACATCC GTTGGAGCAT 
GAGTTCCCGA CATCAGGCGG CGGCGGTGGT CCGGGAGAAA CCCGGCGGCG 
GGGAGATAAG CCTGCCCAGG AGGCAGGGGG CTGGGCTAGC TGCCCCGCCC 
CGCGCCTGAC TTCGTTGGGG AGGGAGACGC CCGGCTCCCG CCCCTAACTA 
GCCCAGCCGC GCGGAGCGCC TGGGAGAGGA GAAGGAGCCG ACCTGCCGAG (-1) 

ATG 



